On the boosting ability of top-down decision tree learning algorithm for multiclass classification
نویسندگان
چکیده
We analyze the performance of the top-down multiclass classification algorithm for decision tree learning called LOMtree, recently proposed in the literature Choromanska and Langford (2014) for solving efficiently classification problems with very large number of classes. The algorithm online optimizes the objective function which simultaneously controls the depth of the tree and its statistical accuracy. We prove important properties of this objective and explore its connection to three well-known entropy-based decision tree objectives, i.e. Shannon entropy, Gini-entropy and its modified version, for which instead online optimization schemes were not yet developed. We show, via boosting-type guarantees, that maximizing the considered objective leads also to the reduction of all of these entropy-based objectives. The bounds we obtain critically depend on the strong-concavity properties of the entropy-based criteria, where the mildest dependence on the number of classes (only logarithmic) corresponds to the Shannon entropy.
منابع مشابه
Real Boosting a la Carte with an Application to Boosting Oblique Decision Tree
In the past ten years, boosting has become a major field of machine learning and classification. This paper brings contributions to its theory and algorithms. We first unify a well-known top-down decision tree induction algorithm due to [Kearns and Mansour, 1999], and discrete AdaBoost [Freund and Schapire, 1997], as two versions of a same higher-level boosting algorithm. It may be used as the ...
متن کاملBoosting with Multi-Way Branching in Decision Trees
It is known that decision tree learning can be viewed as a form of boosting. However, existing boosting theorems for decision tree learning allow only binary-branching trees and the generalization to multi-branching trees is not immediate. Practical decision tree algorithms, such as CART and C4.5, implement a trade-off between the number of branches and the improvement in tree quality as measur...
متن کاملMulticlass Alternating Decision Trees
The alternating decision tree (ADTree) is a successful classification technique that combines decision trees with the predictive accuracy of boosting into a set of interpretable classification rules. The original formulation of the tree induction algorithm restricted attention to binary classification problems. This paper empirically evaluates several wrapper methods for extending the algorithm...
متن کاملEfficient Multiclass Boosting Classification with Active Learning
We propose a novel multiclass classification algorithm Gentle Adaptive Multiclass Boosting Learning (GAMBLE). The algorithm naturally extends the two class Gentle AdaBoost algorithm to multiclass classification by using the multiclass exponential loss and the multiclass response encoding scheme. Unlike other multiclass algorithms which reduce the K-class classification task to K binary classifi...
متن کاملBOAI: Fast Alternating Decision Tree Induction Based on Bottom-Up Evaluation
Alternating Decision Tree (ADTree) is a successful classification model based on boosting and has a wide range of applications. The existing ADTree induction algorithms apply a “top-down” strategy to evaluate the best split at each boosting iteration, which is very time-consuming and thus is unsuitable for modeling on large data sets. This paper proposes a fast ADTree induction algorithm (BOAI)...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1605.05223 شماره
صفحات -
تاریخ انتشار 2016